(57) Abstract: A method and system for intelligently directing a search of a
peer-to-peer network, in which a user performing a search is assisted in
choosing a host which is likely to return fast, favorable results to the user.
A host monitor monitors the peer-to-peer network and collects data on various
characteristics of the hosts which make up the network. Thereafter, a host
selector ranks the hosts using the data, and passes this information to the
user. The user then selects one or more of the highly-ranked hosts as an entry
point into the network. Additionally, a cache may collect a list of hosts based
on the content on the hosts. In this way, a user may choose to connect to a
host which is known to contain information relevant to the user's search. The
host selector may be used to select from among the hosts listed in the cache.

SYSTEM AND METHOD FOR SEARCHING PEER-TO-PEER
COMPUTER NETWORKS

Background of the Invention 

Field of the Invention 

The present invention relates generally to the searching of data contained within a
computer network, and, more particularly, to a system and method for searching
peer-to-peer computer networks by determining optimal hosts for searching.

Discussion of the Related Art 

The computer network now known as the Internet began by individuals forming
"links" between their respective computers. Over time, for a variety of reasons, users
began to access more and more information through a centralized location or locations.
Users' information was uploaded to servers, which were in turn accessed and searched by
other users. Today, users typically access the Internet only through their (local) service
provider, and companies such as Excite™ and Yahoo!™ provide users with search
engines, or information portals, which attempt to provide users with a primary access point
for Internet searching and use.

Although such centralized sites have various advantages (e.g., the ability to
provide an optimized directory to search available resources), the above Internet model, as
a whole, suffers from a number of shortcomings. For example, such centralized access
and search sites (especially to the extent that they may become inoperable or shut down
for any reason) are potential single points of failure, or "weak links in the chain," to the
flow of information. Moreover, they typically provide access to only a small portion of
the total resources of the Internet (less than 1%, by some estimates, and this number will
grow smaller as the Internet grows larger), and may provide links to sites which are
outdated (i.e., no longer available). In short, users become overly reliant on services that
do not provide reliable, effective "one-stop" Internet access and searching.
As a result, "peer-to-peer" networks, in which every computer can serve as both a
host and a client (i.e., can both provide and receive files to/from one another), have
recently become more popular. Such networks link individual computers to one another,
and are essentially file-sharing systems with limited searching abilities. These networks
have certain advantages over the Internet model described above. For example,
peer-to-peer networks often provide a greater number and variety of resources. Moreover, links
will not be outdated, to the extent that only those files which are currently connected to the
network are searched.

Some peer-to-peer networks, however, remain largely centralized. That is,
although users are connected to each other, all connections are routed to and/or through a
central location. Thus, such systems retain at least some of the shortcomings discussed
above; primarily, they contain an obvious choke point(s) at which the exchange of
information may be slowed or stopped. Moreover, although such networks have the
potential to provide a greater number and variety of resources, it has been difficult to
devise a searching technique for effectively utilizing these resources.

Decentralized peer-to-peer networks also exist, in which each computer is linked
only to other computers within the network. These networks provide many of the
advantages of a centralized peer-to-peer network, but are much more resilient, inasmuch
as they are not dependent on any particular site or server. However, as will become
apparent, a search technique which is efficient and effective on these networks has not yet
been devised.

Fig. 1 illustrates a simplified block diagram of a generic decentralized peer-to-peer
network 100. In Fig. 1, a user "A" on host computer 110 connects to at least one other
host, which is itself connected to at least one other host on the network. In Fig. 1, each
host is numbered 1-5 to demonstrate the number of connections, or "hops," between that
host and the user host 110. For example, host 120 is designated "2," as it is 2 hops away
from user host 110. Host 130 is 5 hops away from user host 110 via one connection path,
but is only 3 hops away via another connection path.

A more specific example of a known decentralized peer-to-peer network is the
Gnutella Network (hereafter, Gnutella), which utilizes the basic structure shown in Fig. 1.
To utilize Gnutella, a user A must first connect to the network by connecting to at least
one other host 140, as shown in Fig. 1. This host may be selected at random, or a
particular user may have the knowledge or desire to choose a particular host or hosts. In
either case, the user is thus connected to a number of hosts through the initially selected
host(s). In other words, the user's connections will spread out until the number of hosts
(approximately) reaches a predetermined number of hosts (hereafter referred to as a cluster
of hosts) which the network is deemed capable of handling. The hosts illustrated in Fig. 1
may be thought of as such a cluster of hosts.

To process a search request, Gnutella simply passes the search query from one host
to the next, in the hopes of finding the searched-for data on a host which is only a few
"hops" away. Thus, the query will not reach beyond the user's isolated cluster of hosts,
which contain only a limited amount of content (especially if the user chose poorly in
selecting his or her initial host connection). This results in poor search results, despite the
availability of content in the broader network.

Moreover, the exponential manner in which queries are passed from one host to the
next can easily result in many or all of the hosts being virtually dedicated to nothing but
the activity of passing along queries and query results for other hosts, with little time or
ability left over for any other functionality. Clearly, this shortcoming causes each host, as
well as the network as a whole, to operate significantly slower than at optimum speed.

Additionally, in peer-to-peer networks in general, hosts periodically connect and
disconnect, so that the availability of hosts is constantly in flux. In other words, although
links in a peer-to-peer network will not be stale or outdated in the traditional sense (as
mentioned above), it is possible that, even if a given host still contains the desired
information, the host will be disconnected from the network when a user seeks to access
this information. Also, a host could disconnect from the system during a download of
search results. This instability further deteriorates the reliability of searches on the
network.

Finally, since hosts in Gnutella and other peer-to-peer networks are selected
blindly, there is no way of using geographical location of the other host(s) as a factor in
host selection/searching. In other words, prior art peer-to-peer networks will show that a
given host is directly connected to the user (and therefore seemingly a good candidate for
access), but will not demonstrate the fact that the host may be geographically very distant
from the user. As a result, the transfer of information is inefficient in such networks; for
example, a time required to search and download files may become inordinately long.

What is needed is a system and method for effectively and efficiently searching a
decentralized peer-to-peer network, in which the likelihood of fast, favorable search
results is increased, and the stability of the network is improved.

Summary of the Invention

A system and method for searching a decentralized peer-to-peer network according
to an embodiment of the present invention utilizes intelligent host selection to increase the
chances of fast, favorable search results (i.e., results which are useful to the searcher) and
to provide a more stable network environment.

In particular, the present invention optimizes the starting points (i.e., starting hosts)
for distributed search queries by directing queries to hosts that provide the best chance of
either housing the content or being linked to a group of hosts that contain the content.

In order to achieve the above, the present invention monitors the hosts within the
network over a period of time, and collects a large and dynamic set of data. Using this
data set, the present invention ranks the monitored hosts according to which ones are most
stable and most likely to contain favorable search results. Thereafter, the present
invention routes search queries to the most highly-ranked hosts.

Thus, a user is generally directed to a cluster of hosts deemed most likely to return
fast, favorable results. However, the user can request to be re-connected to another
(highly-ranked) host cluster if that user wishes to search for more or different results.
Alternatively, the user could be periodically reconnected to another host cluster as a matter
of course, in order to assure the broadest search possible.

Examples of the collected data used to rank the hosts include the number of files
on a host and the number of kilobytes stored on a host. This data is useful because hosts
with high levels of content are good starting points for distributed queries. Similarly,
hosts that are either connected to hosts with high levels of content, or are close to such
hosts, are good starting points for queries.

Additionally, the data set may include the frequency with which a particular host is
connected to the network, as well as the reliability of that host's connection. In this way,
search queries can be directed to obtain hosts that are deemed stable, so that the user
operates in a more stable environment.

The data set may also include content-specific data (such as file type or topic). For
example, a cache memory can store such content-specific data, along with a network
location of hosts which contain the data. This data can be collected by, for example: (1)
intercepting queries to and from other hosts within the network, (2) using a user's previous
search results, or (3) using results from periodically-posed common queries to the
network. Preferably, a user's search query to such a cache memory should subsequently
be directed only to those hosts which are connected to the network at the time of a user's
search. In this way, users can quickly locate connected hosts which previously proved
useful in returning favorable results on a specific topic or file-type.

The data collected on the various hosts should include data concerning the
geographical location, as well as network connectivity information and network location
of the host(s), so that a user may connect to hosts which are as close as possible to the
user. Preferably, this location data should be collected by spreading the data-collecting
functionality to various geographical locations which are as close as possible to a
particular user. In one embodiment of the invention, virtually the entire collecting,
ranking and storing functionality of the invention may be performed by each user.

Other types of data to be collected for use in host ranking, and various 
methodologies for ranking the hosts based on the data, are discussed in more detail below. 

Other features and advantages of the invention will become apparent from the
following drawings and description.

Brief Description of the Drawings 

The present invention is described with reference to the accompanying drawings.
In the drawings, like reference numbers indicate identical or functionally similar elements.
Additionally, the left-most digit of a reference number identifies the drawing in which the
reference number first appears.

Fig. 1 illustrates a conventional decentralized peer-to-peer network. 

Fig. 2 illustrates a network overview of an embodiment of the present invention.

Fig. 3 is a more detailed view of an exemplary host monitor such as the one shown 
in Fig. 2. 

Fig. 4 is a flow chart illustrating an exemplary methodology of an embodiment of
the present invention.

Detailed Description 

The present invention is directed to a system and method for effectively searching
a peer-to-peer network in a stable network environment. While the present invention is
described below with respect to various explanatory embodiments, various features of the
present invention may be extended to other applications as would be apparent.

Fig. 2 illustrates a system overview 200 of one embodiment of the present
invention. Although the various system components appear to be external to network 100,
it is important to note that this is for the sake of illustration only. That is, all of the system
components may be connected to and within network 100, and may therefore send,
transmit or respond to queries from any other host within the network. In particular, user
210 should be thought of as just an example of any (potential) host within the network
100.

In Fig. 2, peer-to-peer network 100 may be a known decentralized peer-to-peer
network. User 210 can access network 100 directly, for searching and other uses.
However, according to the present invention, user 210 also receives information on
intelligent and optimized host selection, to thereby dramatically improve the user's search
time and results when performing searches for files throughout network 100.

Host monitor 220 is responsible for collecting data on the hosts within network
100. More specifically, host monitor 220 collects status information about the hosts, such
as the connectivity status of the hosts to the network, the amount of content on the hosts
which is available to the network, etc. Generally, host monitor 220 actively collects
up-to-date status information on the hosts within network 100.

In one embodiment, host monitor 220 contains profiler 230 and statistics database
240. Profiler 230 periodically sends a data collection signal(s) into network 100, and
collects corresponding status information in statistics database 240.

Host selector 250 receives data from host monitor 220 and ranks the hosts within
network 100. That is, the hosts within network 100 are ranked according to criteria
(based on the collected status information) which determine the most useful hosts for a
particular user. The ranking criteria may vary according to the needs of a particular user.

Cache memory (hereafter, cache) 260 stores information about the content of hosts
within the network, as opposed to the status information collected by host monitor 220.
That is, cache 260 stores information on the type of files available from a particular host
(for example, JPEG files), and/or topical information available from a particular host (for
example, files containing recipes). The content information can be collected in a variety
of ways, but is generally collected passively, and, therefore, may become outdated (for
example, a host containing certain content may disconnect from the network).

Thus, based on the above description, it is apparent that a user 210 who wishes to
initiate a search of network 100 can receive a snapshot of the topology of network 100
from host selector 250 and cache 260. This information will guide the user's search,
allowing the user to intelligently choose a host or hosts which will be most likely to return
fast, favorable results to the user.

Preferred embodiments of host monitor 220 will now be discussed in greater detail.

Although conventional decentralized peer-to-peer networks have limited ability to
gather data concerning the network, these statistics are not sufficiently helpful or reliable,
and do not assist at all in intelligent host selection. For example, Gnutella provides the
number of hops between hosts. However, a host can be directly connected to another host,
yet the machines the hosts run on may be on the opposite sides of a continent. Also,
Gnutella provides no statistics on the stability of a particular host or hosts. Hence,
Gnutella statistics can be very misleading.



WO 02/15035
PCT/US01/25096

There are many statistical measures of hosts within network 100 which can be
measured by host monitor 220 to provide status information about the hosts, and thereafter
be sent to host selector 250. The following is an exemplary list of statistical measures
which can be monitored and collected by host monitor 220.
Round trip time (hereafter, rtt): This measure is defined by the time it takes a
query from the profiler 230 to return a result from the host being profiled. Rtt may be
measured using a ping. Ping, as is commonly known, is short for Packet Internet Groper,
and is a utility to determine whether a specific IP address is accessible. It works by
sending a packet to the specified address and waiting for a reply. In general, a packet is a
piece of a message within a packet-switching protocol, which is a protocol in which a
message is broken into pieces (packets) to be sent separately to a destination, where they
are recompiled. Advantageously, packets contain their destination address, as well as any
data to be transmitted. ICMP, short for Internet Control Message Protocol, supports
packets containing error, control, and informational messages. Thus, for example, the rtt
can be determined based on the average of three ICMP pings.

It is important to note that rtt for a particular host relative to a particular user is
dependent on where the profiler is geographically located. Hence, in a preferred
embodiment, multiple profilers are maintained in remote locations; for example, in
different sections of a country, or within a predetermined distance of a user. This can
provide information related to the physical location of the host.

Bandwidth (bw) - The bandwidth is a measure of the throughput of a host being
profiled; i.e., its ability to receive, transmit and/or respond to a particular amount of data in
a particular amount of time. Bw can be measured based on doing 2 ICMP pings, with
different lengths of the payload, and determining the impact on the return time. Like rtt,
bw is dependent on where the profiling machine is located, due to the interconnecting
network. For instance, the bw between two hosts within an intranet is likely to be high.
However, the bw between a host inside an intranet and one outside the network is typically
smaller, since it is limited by the smallest interconnecting pipe between the two hosts.

Gnutella round trip time (grtt) - This is the time it takes for a Gnutella ping to
return to the profiling machine. A Gnutella ping is simply a type of ping used by Gnutella
to obtain the Gnutella topology. That is, an ICMP ping travels through the Internet
topology, while the Gnutella ping travels through the Gnutella topology (based on the
ad-hoc interconnections between hosts).

Number of files shared (nf) - This measures the number of files shared (i.e., made
available to the rest of the network) by a particular host.

Number of kilobytes shared (nk) - This measures the number of kilobytes shared
by a particular host.

Hops away from profiler(s) (hops) - This measures the approximate number of
hosts between the profiler and a particular host.

Number of hosts connected (nh) - This measures the number of hosts connected to
a particular host.

Liveness Score (lh) - This measures how many times in the last 60 minutes a
particular host was alive.

Reachability of Host (rh) - This details how many times the host monitor has
successfully connected directly to a particular host.
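
The measures listed above can be pictured as one record per host. The following Python sketch is illustrative only: the names HostStats, average_rtt, estimate_bw_mbps and liveness_score are hypothetical, and the bandwidth formula is one simple model (the extra payload bytes are assumed to cross the path twice, once in the echo request and once in the reply), not a formula given in this description.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class HostStats:
    """One record per profiled host; field names follow the abbreviations above."""
    address: str    # network location of the host, e.g. an IP address
    rtt_ms: float   # round trip time in milliseconds
    bw_mbps: float  # bandwidth in megabits per second
    nf: int         # number of files shared
    nk: int         # number of kilobytes shared
    hops: int       # hops away from the profiler
    nh: int         # number of hosts connected
    lh: int         # liveness score (alive checks in the last 60 minutes)
    rh: int         # reachability (successful direct connections)


def average_rtt(samples_ms):
    """Average several ping round trip times, e.g. three ICMP pings."""
    return mean(samples_ms)


def estimate_bw_mbps(small_bytes, small_rtt_ms, large_bytes, large_rtt_ms):
    """Estimate bandwidth from two pings with different payload lengths.

    Assumption: the extra bytes travel out and back, so the added return
    time is attributed to 2 * 8 * (extra bytes) bits.
    """
    extra_bits = 2 * 8 * (large_bytes - small_bytes)
    extra_seconds = (large_rtt_ms - small_rtt_ms) / 1000.0
    if extra_seconds <= 0:
        raise ValueError("larger payload must take measurably longer")
    return extra_bits / extra_seconds / 1e6


def liveness_score(alive_checks):
    """Count how many periodic checks in the window found the host alive."""
    return sum(1 for alive in alive_checks if alive)
```

With a check every 5 minutes, a host that answered every check in the past hour would have a liveness score of 12.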
It would be most desirable to collect all of the above data, along with additional
data (and have the data be current to the second or better), from every host on the network.
This would allow the best selection of hosts for searching. However, as a practical matter,
to do so could overly tax the hosts and/or network. For instance, performing ICMP pings
on all the hosts would result in a flood of complaints from the administrators of these
machines. Therefore, it is preferable to collect only the data necessary to gain a desired
improvement level in searching, to thereby avoid overly taxing the hosts and/or network.

For example, once enough historical data has been obtained, stable and unstable
hosts can be identified. Stable hosts may not need to be monitored with the same
frequency as unstable hosts. For example, stable hosts can be assessed relatively
infrequently, for example, every week. In contrast, unstable hosts can be checked more
often, for example, once per day.
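
The weekly and daily intervals mentioned above can be sketched as a simple due-date check. This is a minimal illustration; the function name is_due and the constants are hypothetical, and how a host is classified as stable is left to the caller because the description does not fix a rule.

```python
from datetime import datetime, timedelta

# Example intervals taken from the text: weekly for stable hosts,
# daily for unstable hosts.
STABLE_INTERVAL = timedelta(weeks=1)
UNSTABLE_INTERVAL = timedelta(days=1)


def is_due(last_checked, now, stable):
    """Return True when a host should be measured again."""
    interval = STABLE_INTERVAL if stable else UNSTABLE_INTERVAL
    return now - last_checked >= interval
```

A host last measured three days ago would be due again only if it is classified as unstable.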

Similarly, host monitor 220 may ping a limited number of machines from a certain
network, and extrapolate results to the rest of the network. This method can avoid the
need to ping all other machines from that same network.

Also, within Gnutella, a Gnutella ping can be used to gather other relevant
statistical measures. For example, a Gnutella ping can be sent every 5 minutes.
Additionally, ping messages (more precisely pong messages; i.e., the ping messages which
are returned from a host) that are routed through the network can be used to extract the
Gnutella network topology. That is, as mentioned above, host monitor 220 can be thought
of as being within the network, and, therefore, receives and passes the various queries
which are constantly being transmitted by all hosts. Thus, these messages (i.e., the ping
messages that are being broadcast by other Gnutella hosts) can be monitored, in order to
decrease the frequency with which pings are sent by the present invention. For instance, a
host that sent out a ping and a host that responds to a ping are clearly both alive, and will
not have to be pinged again soon.
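
The passive monitoring described above amounts to remembering when each host was last seen in any ping or pong, and skipping active pings for hosts seen recently. The following sketch assumes timestamps in seconds and a hypothetical class name LivenessTracker; the 300-second default mirrors the 5-minute example and is not a required value.

```python
class LivenessTracker:
    """Tracks when each host was last observed in overheard traffic."""

    def __init__(self, quiet_seconds=300.0):
        # Hosts seen within this window are treated as alive (assumed value).
        self.quiet_seconds = quiet_seconds
        self.last_seen = {}

    def observe(self, host, timestamp):
        """Record that a ping from, or a pong by, this host was overheard."""
        previous = self.last_seen.get(host)
        if previous is None or timestamp > previous:
            self.last_seen[host] = timestamp

    def needs_ping(self, host, now):
        """Return True if the host has not been seen recently enough."""
        seen = self.last_seen.get(host)
        return seen is None or now - seen >= self.quiet_seconds
```

A host whose pong was overheard a minute ago is skipped, while a host never seen, or last seen ten minutes ago, is pinged actively.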

An example of host monitor 220 which efficiently collects data is shown in more
detail in Fig. 3. In Fig. 3, Rtt/Bw Measurer 310 runs every day to measure rtt and bw of
unstable hosts and every week to measure the rtt and bw of stable hosts. Also, Gnutella
Pinger 320 runs every 5 minutes to obtain grtt, nf, nh, nk, hops. As discussed above,
Gnutella Pinger 320 also serves to snoop Gnutella pongs (i.e., to passively monitor
returning pings which are sent by other Gnutella hosts) to help decrease the frequency of
the pinging.

Although the above statistical measures can be grouped according to which is most
important to a particular user, it is also true that, in general, rtt and bw are very important
in making host selection decisions. This is due to the fact that users of the network can be
located in geographically remote locations. Hence, for example, when a user from the east
coast wants to download or share files, it is best to use servants that perform most
efficiently for the east coast. Therefore, in one embodiment of the invention, a plurality of
profilers are used, and each one is in a location which is geographically remote from the
others.

In a further embodiment for efficiently collecting the network data, the amount of
data to be collected may be reduced by dynamically identifying hosts which are important
"hubs" in the network, and concentrating on those hosts (for example, collecting data
about these hosts every 5 minutes, and collecting data on remaining hosts less frequently).

For example, the host monitor may closely monitor a predetermined number of
hosts out of the total number of hosts within the network, and periodically track pongs
from these hosts. Subsequently, these hosts can be ranked based on their various
characteristics (e.g., nh), so that only a certain percentage of these hosts (e.g., the top half)
need be retained as hubs.

Thereafter, at less frequent time intervals, a certain number of the (most
lowly-ranked) hubs can be removed from the list of hubs, so that the process can be repeated.
That is, the predetermined number of hosts within the network may be monitored and
ranked again, resulting in a new set of hubs. Specifically, a new set of hosts for
monitoring might be chosen randomly, or based on the number of hops they are away
from the current set of hubs (i.e., the higher the number of hops from the current hubs, the
better the coverage of the network will be). This replacement process need only occur
infrequently; for example, several times a day or less. In this way, as the network
changes, the hubs of the network will also change, and the host monitor will dynamically
reconfigure itself to the new network topology. Thus, the network can be efficiently and
effectively monitored.
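
The hub selection and replacement cycle described above might be sketched as follows. The function names select_hubs and rotate_hubs are hypothetical, nh is used as the only ranking characteristic, and ties are broken by host name purely to keep the result deterministic; the description permits other characteristics and other ways of choosing candidates.

```python
def select_hubs(nh_by_host, keep_fraction=0.5):
    """Rank monitored hosts by nh and keep the top fraction as hubs."""
    ranked = sorted(nh_by_host, key=lambda h: (-nh_by_host[h], h))
    if not ranked:
        return []
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]


def rotate_hubs(hubs, candidates, nh_by_host, drop_count):
    """Drop the lowest-ranked hubs, then re-rank survivors plus new candidates.

    The returned list has the same length as the current hub list
    (or fewer entries if there are not enough hosts to fill it).
    """
    ranked = sorted(hubs, key=lambda h: (-nh_by_host.get(h, 0), h))
    survivors = ranked[:max(0, len(ranked) - drop_count)]
    pool = survivors + [c for c in candidates if c not in survivors]
    reranked = sorted(pool, key=lambda h: (-nh_by_host.get(h, 0), h))
    return reranked[:len(hubs)]
```

Running rotate_hubs a few times a day lets well-connected newcomers displace hubs that have lost connections.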

It is important to note that all of the collected data discussed above contains a
"network location" of a corresponding host. This allows the present invention to correctly
associate a particular (set of) statistics with the proper host, for later direction of (for
example) search queries. Hence, the host monitor may identify a host having a certain rtt
and/or bw value as having a particular IP (Internet Protocol) address. However, the
network location should not be confused with the geographical location referenced above,
which refers to an actual, physical location of a host computer.

In summary, host monitor 220 collects data concerning the current status of hosts
within network 100, as well as the corresponding network location of the hosts.
Generally, this process occurs actively (e.g., the profiler 230 sends out an ICMP ping and
receives it upon its return), but can also occur passively (e.g., Gnutella Pinger 320 snoops
Gnutella pongs), in the interest of efficiency. In either case, the data is preferably as
up-to-date as is reasonably possible. Thus, the host monitor according to the present
invention collects a sufficient amount of data necessary to allow intelligent host selection,
while minimizing the impact of host monitoring on the network.

Preferred embodiments of host selector 250, operating in conjunction with host 
monitor 220, will now be discussed in greater detail. 

In general, host selector 250 receives the statistics collected by host monitor 220,
and determines the rank of each of the hosts by applying weights to each of the criteria for
each profiled host.

The statistics are combined to obtain a host rank based on the characteristics of a
"good host." For example, a host rank may be determined as follows.

First, the desired characteristics of a "good host" may be defined as:

1. 0.1 ms (rtt)

2. 8 Mbps (bw)

3. grtt is not considered

4. 250 files shared (nf)

5. l^OG shared (nk)

6. 2 hops away (hops)

7. 20 hosts connected to it (nh)

8. 12 (lh) (i.e., alive 12 times in past 60 minutes, if period of checking is 5
minutes)

9. 1 (rh) (i.e., actually connected to host in past day)

Then, for the measures bw, nf, nk, nh, lh and rh, hosts that have exactly the value
of a "good host" are given a score of 1 for the statistic. Hosts with lower values (than the
benchmark given for a good host) are penalized, and hosts with higher values are
rewarded. For instance, if a host has a bw value of 4Mbps, it will get a bw score of 0.5
(4Mbps/8Mbps). The reward (and/or penalty) may decrease (increase) as the value
increases (decreases) beyond a certain point, for example, in either a linear or exponential
fashion.

Conversely, for the measures of rtt and hops, hosts with higher values are
penalized, and servants with lower values are rewarded. The reward/penalty function can
again be correspondingly adjusted, as referred to above.

Subsequently, each of the above-determined scores for bw, nf, nk, nh, rtt, hops, lh
and rh is assigned a weight. Then, the overall score (rank) is obtained by applying a
weight to each measure, as follows:

Host rank = Bw_weight*Bw_score + nf_weight*nf_score + ...

In this way, the hosts can then be ranked by their respective host rank scores.
Also, for example, the hosts that were alive in the last ten minutes can be ranked first, and
then the hosts that were alive in the last hour can be appended to the list. Thus, hosts most
likely to be available are preferred.
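
The scoring and weighting steps above can be sketched in Python. This is a minimal illustration under stated assumptions: the plain ratio is used as the score in both directions (value/benchmark where higher is better, benchmark/value for rtt and hops), which matches the 4Mbps/8Mbps example but is only one of the reward/penalty functions the description allows. The names GOOD_HOST, measure_score, host_rank and order_hosts are hypothetical, and nk is left out of the benchmark table because its value is not legible in the list above.

```python
# Benchmarks of a "good host" from the list above (nk omitted, see lead-in).
GOOD_HOST = {"rtt": 0.1, "bw": 8.0, "nf": 250, "hops": 2,
             "nh": 20, "lh": 12, "rh": 1}
LOWER_IS_BETTER = {"rtt", "hops"}


def measure_score(name, value):
    """Score one measure; a host matching the benchmark scores exactly 1."""
    benchmark = GOOD_HOST[name]
    if name in LOWER_IS_BETTER:
        if value <= 0:
            raise ValueError("rtt and hops must be positive")
        return benchmark / value
    return value / benchmark


def host_rank(stats, weights):
    """Weighted sum of per-measure scores for the measures named in weights."""
    return sum(weight * measure_score(name, stats[name])
               for name, weight in weights.items())


def order_hosts(hosts):
    """Hosts alive in the last ten minutes first, then those alive in the
    last hour; each group is sorted by descending rank."""
    def by_rank(host):
        return -host["rank"]
    recent = sorted((h for h in hosts if h["minutes_since_alive"] <= 10),
                    key=by_rank)
    older = sorted((h for h in hosts if 10 < h["minutes_since_alive"] <= 60),
                   key=by_rank)
    return recent + older
```

Different host selection servers would simply pass different weights dictionaries to host_rank.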
Note that the above formula can be manipulated based on the needs of the user
performing a host selection. For instance, for two host selection servers, one may weight
rtt and bw more over the number of hosts connected, whereas another host selection server
may weight more highly the number of hosts connected, and so on.

Additionally, a host which is very limited in one aspect may nevertheless be very
valuable. For example, a host may contain no searchable content whatsoever (nf = nk =
0); nevertheless, that host may be highly ranked if it is connected to a large number of
other hosts (and particularly if these hosts contain a large amount of content). In other
words, the host rank does not have to depend on the intrinsic properties of the host itself
but can be based on connection properties as well. Such values can be measured by
recursive propagation. To illustrate this point, consider the network 100 shown in Fig. 1,
and consider that host monitor 220 sends a ping from the location of host 120 to host 140
(i.e., one hop, and disregarding host 130 for the moment). The result may be that host 140
contains little or no content. However, sending a ping two hops from host 120 returns a
result of at least three other hosts, 110, 150 and 160, all of which may contain a large
amount of information. Therefore, host 140 may be highly ranked. This process can be
extended by sending a ping out three hops, four hops, etc.
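
The idea of crediting a host with the content of its neighborhood can be sketched as a breadth-first walk that sums shared files within a hop limit. The function name content_within_hops, the adjacency map and the file counts below are all hypothetical; the toy topology only mirrors the hosts named in the example and is not the actual layout of Fig. 1.

```python
from collections import deque


def content_within_hops(graph, files_by_host, start, max_hops):
    """Sum the files shared by start and every host within max_hops of it."""
    seen = {start}
    queue = deque([(start, 0)])
    total = 0
    while queue:
        host, distance = queue.popleft()
        total += files_by_host.get(host, 0)
        if distance == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(host, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, distance + 1))
    return total


# Illustrative topology: host 140 shares nothing itself but links to
# three content-rich hosts.
example_graph = {
    "120": ["140"],
    "140": ["120", "110", "150", "160"],
    "110": ["140"],
    "150": ["140"],
    "160": ["140"],
}
example_files = {"110": 100, "150": 200, "160": 300}
```

In this toy data, one hop from host 120 finds no content, while two hops reach hosts 110, 150 and 160, which is why host 140 can deserve a high rank.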

Additionally, it may be preferable to periodically select hosts based only on hops,
in order to increase the coverage of the hosts selected. In other words, even if intelligent
host selection is performed as described above, a user will be limited to the cluster of hosts
corresponding to the selected hosts (as conceptualized by the host cluster of Fig. 1). This
is because the number of hosts to which a query may be passed is limited by the limits of
the network and the exponential nature of the query circulation.






For example, in Gnutella, messages are usually given a time-to-live (ttl) of seven. 
That is, if a message has been forwarded seven times, the host currently processing the 
message drops it. Otherwise, the number of connected hosts would grow too rapidly for the 
user and/or the network to manage. Thus, the user is effectively limited to a cluster of 
inter-connected hosts, so that hosts that are, for example, ten hops away from a user host 
are usually inaccessible to that host. However, these inaccessible servents may be 
accessible to servents that are seven hops away in a different direction. Thus, it may be 
beneficial to effectively give a user access to a separate cluster of hosts by periodically 
selecting a high-hops host. 
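
The effect of the time-to-live limit can be illustrated with a small reachability 
calculation. This is a sketch under stated assumptions: the function name reachable and 
the 16-host chain are hypothetical, and the sketch assumes that each forwarding step uses 
up one unit of ttl, so a ttl of seven reaches hosts at most seven hops away. 

```python
def reachable(start, ttl, links):
    """Return the hosts a query from `start` can reach before its ttl runs out."""
    frontier = {start}
    seen = {start}
    for _ in range(ttl):
        # Forward the query one more hop to every peer not yet visited.
        frontier = {p for h in frontier for p in links.get(h, ()) if p not in seen}
        if not frontier:
            break
        seen |= frontier
    return seen - {start}


# Hypothetical line of 16 hosts: 0 - 1 - 2 - ... - 15
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 15] for i in range(16)}

near = reachable(0, 7, chain)  # cluster seen from user host 0
far = reachable(7, 7, chain)   # cluster seen after restarting at the high-hops host 7
```

In this toy network host 10 is outside the cluster reachable from host 0, but it falls 
inside the cluster reachable from host 7, which is the benefit of periodically restarting 
from a high-hops host. 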

Also, the user could be given the option of simply choosing to jump to another host 
cluster. That is, the user could choose to simply continue a search from a new starting 
point of another, also highly-ranked host. This could also be achieved automatically, by 
simply periodically moving the user to a new starting point (i.e., new starting host). 

In summary, the host selector 250 serves to combine the results of the profiler 230 
(as collected in statistics database 240), and thereby compute host rank. The host selector 
may also jolt the system every so often using the high-hop technique described above, or 
may allow the user to search from a new starting point, as desired or necessary. 
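
One way to picture how the host selector combines statistics into a rank is a 
weighted sum. The sketch below is illustrative only: the measure names (rtt_ms, bw_kbps, 
peers), the weights and the plain weighted-sum formula are assumptions, not the formula of 
the specification. It also shows how two selection servers with different weights can 
order the same hosts differently. 

```python
# Hypothetical weights: a negative weight penalizes a long round trip time.
LATENCY_FOCUSED = {"rtt_ms": -1.0, "bw_kbps": 0.5, "peers": 2.0}
PEER_FOCUSED = {"rtt_ms": -0.1, "bw_kbps": 0.1, "peers": 5.0}


def score(measures, weights):
    """Combine a host's statistical measures into one weighted score."""
    return sum(weights[name] * measures.get(name, 0.0) for name in weights)


def rank_hosts(stats, weights):
    """Return host names ordered from highest to lowest weighted score."""
    return sorted(stats, key=lambda host: score(stats[host], weights), reverse=True)


# Hypothetical statistics as they might be read from the statistics database.
stats = {
    "A": {"rtt_ms": 40, "bw_kbps": 200, "peers": 3},
    "B": {"rtt_ms": 300, "bw_kbps": 400, "peers": 50},
    "C": {"rtt_ms": 80, "bw_kbps": 100, "peers": 30},
}

by_latency = rank_hosts(stats, LATENCY_FOCUSED)
by_peers = rank_hosts(stats, PEER_FOCUSED)
```

With these assumed numbers the latency-focused weights favor host A, while the 
peer-focused weights favor the well-connected host B. 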

Additionally, in order to improve the speed and quality of search results received 
by a user, a preferred embodiment of the present invention employs a cache 260, which 
will now be discussed in greater detail. 

Generally, cache 260 collects content information related to hosts within the 
network 100, such as specific types of files or types of topics which are available for 
searching on the hosts. As shown in Fig. 2, cache 260 may include a list of keywords 
related to a specific topic, such as "recipe," as well as a network location of hosts A and B 
which contain information on this topic. 

In a preferred embodiment, cache 260 passively collects this information by 
intercepting queries and responses sent by other hosts within the network. In other words, 
the cache 260, inasmuch as it is simply another host within the network, must receive and 
transmit queries/responses from other connected hosts. In so doing, the cache 260 may 
record which hosts contain specific content data (e.g., snoop Gnutella pongs for content 
data, as described above with reference to the collection of status data by host monitor 
220). 

This content information will be collected sporadically, to the extent that the cache 
260 cannot control which queries are sent and responded to by other connected hosts. 
Therefore, over time, the content information may become outdated. For example, 
perhaps the recipe information on host A will be removed from that host, or host B may 
simply be disconnected from the network. Thus, in one embodiment, contents of cache 
260 are only stored up to a maximum of some predetermined period of time. However, to 
guard against the deletion of certain common or desired content information, the cache 
260 may periodically send a query concerning that content to the network 100, and 
thereafter store the result. 
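
The expiry behavior just described can be sketched as a small keyword-to-host table 
with timestamps. The class name ContentCache, its method names and the use of plain 
numbers for time are hypothetical choices made for illustration; they are not part of the 
specification. 

```python
class ContentCache:
    """Keyword -> hosts table whose entries expire after `max_age` time units."""

    def __init__(self, max_age):
        self.max_age = max_age
        self._entries = {}  # keyword -> {host: time the host was last observed}

    def observe(self, keyword, host, now):
        """Record that `host` was seen answering for `keyword` at time `now`.

        The same call can store the result of a periodic refresh query.
        """
        self._entries.setdefault(keyword, {})[host] = now

    def hosts_for(self, keyword, now):
        """Return the hosts still fresh for `keyword`, discarding expired ones."""
        known = self._entries.get(keyword, {})
        fresh = {h: t for h, t in known.items() if now - t <= self.max_age}
        self._entries[keyword] = fresh
        return sorted(fresh)


cache = ContentCache(max_age=60)
cache.observe("recipe", "A", now=0)
cache.observe("recipe", "B", now=50)
early = cache.hosts_for("recipe", now=55)   # both entries are still fresh
later = cache.hosts_for("recipe", now=100)  # the entry for host A has expired
```

Calling observe again for a common keyword, after the cache has sent its own query, 
keeps that entry alive past the expiry period. 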

Additionally, cache 260 may rely on host monitor 220, through host selector 250, 
to provide information on whether a particular host is currently connected to the network 
100. In this way, statistics collected by host monitor 220 serve to effectively filter out 
unstable or disconnected hosts from cache 260 when it responds to a particular user 
request. 

Still further, the host selector 250 may serve to rank the hosts stored within cache 
260, using the techniques described above with reference to the host selector. For 
example, at a given time, cache 260 may store information that twenty hosts contain 
information on recipes, or twenty hosts contain JPEG files. From these twenty, ten may 
be removed (i.e., filtered out) because they are currently inactive, or disconnected. The 
remaining ten may be ranked according to the statistics (status information) collected by 
the host monitor 220 and ordered by the host selector 250. In this way, a user may choose 
the top one or two hosts, which are known to contain (or have access to) a large amount of 
the type of information desired, and which can be quickly and conveniently accessed by 
the user. Thus, the user may effectively form a sub-network from the network, 
where the sub-network contains only hosts having the topic or type of files which the user 
finds most useful. 
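
The filter-then-rank step in the twenty-host example can be sketched as follows. The 
function name pick_hosts, the host names and the rank values are hypothetical; the sketch 
only assumes that the cache supplies candidate hosts, the host monitor supplies the set of 
currently connected hosts, and the host selector supplies a numeric rank per host. 

```python
def pick_hosts(cached_hosts, connected, rank, top_n=2):
    """Drop cached hosts that are not connected, then return the best-ranked few."""
    live = [host for host in cached_hosts if host in connected]
    return sorted(live, key=lambda host: rank.get(host, 0.0), reverse=True)[:top_n]


# Hypothetical data: twenty cached hosts for one topic, ten of them connected.
cached = [f"h{i}" for i in range(20)]
connected = {f"h{i}" for i in range(0, 20, 2)}
rank = {f"h{i}": float(i) for i in range(20)}

best = pick_hosts(cached, connected, rank)
```

Here the ten disconnected hosts are filtered out first, and the user is offered the two 
highest-ranked hosts among the remaining ten. 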

In yet another exemplary embodiment, the information collected by the cache need 
not be entirely deleted after a predetermined period of time. Instead, for example, the 
information relating to the type of file(s) available on a particular host or hosts may be 
separately saved and analyzed. In this way, over time, a topology of hosts which routinely 
make available certain file-types can be created and stored. 

To practice this embodiment, a user might first send a query to the cache itself as 
described above. Additionally (or alternatively), the user may determine a host using the 
topology of hosts just discussed, where this topology may be saved and accessed as part of 
the cache, the host monitor, or by an entirely different component of the invention. In this 
way, again, the user may direct queries to those hosts which are very likely to contain the 
types of files for which the user is searching (for example, JPEG files). Also as before, 
the topology of hosts just referred to may be filtered/ranked by the host selector, so that a 
user may further increase the chances of a fast, favorable result. 

In summary, cache 260 (in conjunction with host monitor 220 and host selector 
250) allows a user to initiate a search of the network based on the content of the various 
hosts within the network, rather than just the status of the various hosts. Thus, the user is 
more likely to receive fast, favorable results. 

Figure 4 illustrates an exemplary methodology 400 by which a user may practice 
the present invention. In step 405, as discussed above, the host monitor and cache collect 
status and content information, respectively, about network 100. For the host monitor, this 
process is generally performed periodically and actively. For the cache, the process is 
generally passive, and occurs as information becomes available. 

In step 410, the host monitor outputs its collected status information to the host 
selector 250. The host selector uses this data in step 415 to compile a list of, for example, 
ten hosts which are most likely to provide favorable search results. 

Thereafter, a user may connect to the host selector in step 420, in order to receive 
the list of the ten hosts (i.e., their IP addresses). The user uses this information in step 425 
to connect to one or more of the hosts. 

At this point, the user may choose to search the network in step 430, using the 
provided host or hosts. As discussed above, such searching may include periodically 
restarting the search with another host or cluster of hosts. If this method returns desired 
results in step 435, the user may wish to end the search in step 440 (the user may of course 
continue searching the selected hosts as long as he or she desires). 

If this method does not return desired results, the user may send a search query to 
the cache in step 445. In a preferred embodiment, the user may send a search query to the 
cache in step 445 immediately after connecting to the host(s) in step 425. 

In step 450, it is determined whether the cache contains a host which may contain 
the desired information. If not, the user may continue in step 430 searching hosts provided 
by the host selector. However, if it is determined that such a host is stored within the 
cache, then the connectivity status of the host is checked in step 455, using statistics 
provided by the host monitor. 

If the host is not currently connected to the network, the presence of another host 
within the cache may be checked in step 450. However, if the host is currently connected, 
the query may be sent to that host in step 460. Receiving a desired result in step 465 ends 
the flow in step 470. Otherwise, the user must return either to another cached host in step 
450, or else to the hosts provided by the host selector in step 430. Of course, the user may 
stop the flow at any time simply by disconnecting from the network. 
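
The flow of steps 425 through 470 can be condensed into a single function. This is a 
simplified sketch, not the flowchart itself: the function name run_search and its four 
callable parameters are hypothetical stand-ins for the host selector, cache and host 
monitor, and the sketch stops after one pass instead of looping back to step 430. 

```python
def run_search(query, ranked_hosts, cache_lookup, is_connected, send_query):
    """Try the selector's hosts first, then fall back to connected cached hosts."""
    # Steps 425-440: search through the hosts supplied by the host selector.
    for host in ranked_hosts:
        result = send_query(host, query)
        if result:
            return result
    # Steps 445-470: ask the cache for hosts known to hold matching content.
    for host in cache_lookup(query):
        if not is_connected(host):  # step 455: skip hosts that are offline
            continue
        result = send_query(host, query)  # step 460
        if result:
            return result
    return None  # no desired result was found on this pass


# Hypothetical stand-ins: only hostB holds the content and hostA is offline.
answers = {("hostB", "recipe"): ["pie.txt"]}
result = run_search(
    "recipe",
    ranked_hosts=["host1", "host2"],
    cache_lookup=lambda q: ["hostA", "hostB"],
    is_connected=lambda h: h != "hostA",
    send_query=lambda h, q: answers.get((h, q), []),
)
```

In this run the selector's hosts return nothing, cached host A is skipped because it is 
disconnected, and the query succeeds on cached host B. 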

When implementing the embodiment of the invention as described above, it is 
possible to include all of host monitor 220, host selector 250 and cache 260 at a single, 
remote location with respect to all users. However, as already noted, it is preferable to 
utilize a plurality of geographically remote profilers, in order to determine and make use 
of hosts which are closest to a particular user. 

Additionally, it may be preferable to include some, or even all, of the functionality 
of the present invention at the location of a particular user. In other words, depending on 
the user's access and available resources, it is possible to include a host monitor, host 
selector and cache on a client computer. In this embodiment, since the resources of such a 
client computer are likely to be limited, various steps can be taken to reduce the amount of 
resources necessary to implement this embodiment of the invention. For example, such a 
user may only monitor hosts periodically, rather than constantly, or may only monitor a 
certain number of relatively local hosts. Similarly, the cache employed by the user could 
be more limited. 

In this embodiment, users may nevertheless send data concerning the hosts to a 
central site. Thus, if a plurality of users were to participate in this manner, the central site 
would be able to construct an excellent estimate of the network as a whole, by combining 
the information provided by local users about their local networks. 

As is evident from the above, the present invention assists a user in performing a 
search of a decentralized peer-to-peer network by directing that user to the most pertinent, 
reliable hosts which are currently available on the network. The selected hosts are also the 
ones capable of returning results most quickly (e.g., are closest to the user). Thus, the 
user's search time is reduced, and the odds of favorable results are increased. Moreover, 
the stability of the network (as seen by the user) is increased, and the number of queries 
passed through the network is reduced. 

While this invention has been described in a preferred embodiment, other 
embodiments and variations can be effected by a person of ordinary skill in the art without 
departing from the scope of the invention. 

In the claims: 

1. A method for searching a peer-to-peer computer network comprising: 
    collecting data about a plurality of computers within the network, including a 
    network location of each of the plurality of computers; 
    selecting at least one computer to be a selected computer, based on the collected 
    data; and 
    routing search queries from a user to the selected computer. 

2. The method of claim 1, wherein said collecting data about a plurality of computers 
within the network further comprises: 
    sending a signal to at least one of the plurality of computers; 
    receiving the signal upon its return from the at least one computer; and 
    forming a profile characterizing the at least one computer, based on information 
    provided by the signal. 

3. The method of claim 2, wherein the profile comprises a round trip time taken by 
the signal during its travel to and from the at least one computer. 

4. The method of claim 2, wherein the profile comprises information on the number 
of files contained within the at least one computer. 

5. The method of claim 2, wherein the profile comprises information on the amount 
of content available to the network on the at least one computer. 

6. The method of claim 2, wherein the profile comprises information on the capability 
of the at least one computer to process a search query. 

7. The method of claim 2, wherein the profile comprises information on the number 
of connected computers encountered by the signal during its travel to and from the at least 
one computer. 

8. The method of claim 2, wherein the profile comprises information on the number 
of additional computers connected to the at least one computer. 

9. The method of claim 1, wherein the profile comprises information on a frequency 
with which the plurality of computers are connected to the network. 

10. The method of claim 1, wherein the profile comprises information on which of the 
plurality of computers are currently connected to the network. 

11. The method of claim 1, wherein said collecting data about a plurality of computers 
within the network further comprises: 
    collecting a plurality of statistical measures which characterize each of the plurality 
    of computers, 
and wherein said selecting the selected computer based on the collected data 
further comprises: 
    assigning a weighted score to each statistical measure for each of the plurality of 
    computers; 
    combining the weighted scores to obtain a rank for each of the plurality of 
    computers; and 
    ranking the plurality of computers according to the resulting ranks. 

12. The method of claim 1, wherein said collecting data about a plurality of computers 
within the network further comprises: 
    monitoring data exchanges which occur between the plurality of computers. 

13. The method of claim 12, further comprising: 
    storing the collected data in a memory, wherein at least a portion of the collected 
    data is content data which comprises information on the content available for searching on 
    the plurality of computers. 

14. The method of claim 13, further comprising: 
    removing the content data after a predetermined period of time; 
    sending common user search queries into the network on a periodic basis; and 
    storing the results in the memory. 

15. The method of claim 13, wherein said storing the content data in a memory 
comprises: 
    choosing a portion of the content data to store based on previous user requests. 

16. The method of claim 13, wherein said collecting data about a plurality of 
computers within the network further comprises: 
    monitoring a current connectivity status of each of the plurality of computers, 
and wherein said selecting at least one computer to be a selected computer based 
on the collected data further comprises: 
    selecting the selected computer based on the content data and the current 
    connectivity status. 

17. The method of claim 16, wherein said collecting data about a plurality of 
computers within the network further comprises: 
    collecting a plurality of statistical measures which characterize each of the plurality 
    of computers, 
and wherein said selecting the selected computer based on the collected data 
further comprises: 
    assigning a weighted score to each statistical measure for each of the plurality of 
    computers; 
    combining the weighted scores to obtain a rank for each of the plurality of 
    computers; 
    ranking the plurality of computers according to the resulting ranks; and 
    selecting the at least one computer based on the content data, the current 
    connectivity status and the ranks. 

18. The method of claim 13, further comprising: 
    storing a portion of the content data which identifies a type of file available for 
    searching on the plurality of computers; and 
    selecting the selected computer based at least in part on the stored file-type content 
    data. 

19. The method of claim 1, wherein said selecting at least one computer to be a 
selected computer further comprises: 
    selecting at least a second selected computer based on the data, 
and wherein said routing a search query from a user to the selected computer 
further comprises: 
    routing a search query from the user to the second selected computer after a 
    predetermined period of time, or in response to a user request. 

20. The method of claim 2, wherein said sending a signal to at least one of the plurality 
of computers further comprises: 
    sending the signal from a plurality of geographical locations which are remote 
    from one another, wherein the geographical locations are selected based on their 
    respective proximity to a plurality of users. 

21. The method of claim 1, wherein said collecting data about a plurality of computers 
within the network is performed periodically, so that the collected data is approximately 
current. 

22. The method of claim 1, wherein said collecting data about a plurality of computers 
within the network further comprises: 
    collecting data about a predetermined number of the plurality of computers at a 
    first predetermined time interval; 
    ranking the computers based on the collected data; 
    retaining a set of hub computers which make up a predetermined percentage of the 
    most highly-ranked computers; and 
    collecting data about only the set of hub computers at a second predetermined time 
    interval which is smaller than the first predetermined time interval. 

23. A system by which a user may establish an optimal connection to a peer-to-peer 
computer network, comprising: 
    a monitor which measures data about a plurality of computers within the network; 
    and 
    a selector which selects at least one computer to be a selected computer, based on 
    the measured data, and which outputs a network location of the selected computer to the 
    user, to thereby allow the user to connect to the selected computer. 

24. The system of claim 23, wherein said monitor further comprises: 
    a profiler which collects the measured data by sending a signal to at least one of 
    the plurality of computers and receiving the signal therefrom, to thereby form a profile of 
    the at least one of the plurality of computers; and 
    a database which stores the collected data. 

25. The system of claim 24, wherein the profile comprises a round trip time taken by 
the signal during its travel to and from the at least one computer. 

26. The system of claim 24, wherein the profile comprises information on the number 
of files contained within the at least one computer. 

27. The system of claim 24, wherein the profile comprises information on the amount 
of content available to the network on the at least one computer. 

28. The system of claim 24, wherein the profile comprises information on the 
capability of the at least one computer to process a search query. 

29. The system of claim 24, wherein the profile comprises information on the number 
of connected computers encountered by the signal during its travel to and from the at least 
one computer. 

30. The system of claim 24, wherein the profile comprises information on the number 
of additional computers connected to the at least one computer. 

31. The system of claim 24, wherein the profile comprises information on a frequency 
with which the at least one computer is connected to the network. 

32. The system of claim 24, wherein the profile comprises information on which of the 
plurality of computers are currently connected to the network. 

33. The system of claim 23, wherein the monitor is a computer within the network, and 
further wherein at least a portion of the measured data is collected by monitoring data 
exchanges which travel through the monitor as they are transmitted through the network. 

34. The system of claim 23, further comprising: 
    a memory which is a computer within the network, and which collects content data 
    comprising information on the content available for searching on the plurality of 
    computers by monitoring data exchanges which travel through the memory as they are 
    transmitted through the network. 

35. The system of claim 34, wherein the memory removes the content data after a 
predetermined period of time, 
and further wherein the memory sends common user search queries into the 
network on a periodic basis and stores the results. 

36. The system of claim 35, wherein a portion of the removed content data which 
identifies a type of file available for searching on the plurality of computers is separately 
stored, 
and further wherein the selector selects the selected computer based at least in part 
on the stored file-type content data. 

37. The system of claim 34, wherein the memory chooses a portion of the content data 
to store based on previous user requests. 

38. The system of claim 34, wherein the monitor monitors a current connectivity status 
of each of the plurality of computers, 
and further wherein the selector selects the selected computer based on the content 
data and the current connectivity status. 

39. The system of claim 34, wherein the monitor collects a plurality of statistical 
measures which characterize each of the plurality of computers, 
and further wherein the selector assigns a weighted score to each of the statistical 
measures and combines the weighted scores to thereby rank the plurality of computers 
accordingly, and thereafter selects the at least one computer based on the content data, the 
current connectivity status and the ranks. 

40. The system of claim 23, wherein the selector selects at least a second selected 
computer based on the data, and further wherein the selector outputs a network location of 
the second selected computer to the user after a predetermined period of time, or in 
response to a user request. 

41. The system of claim 24, wherein the profilers are located at a plurality of 
geographical locations which are remote from one another, wherein the geographical 
locations are selected based on their respective proximity to a plurality of users. 

42. The system of claim 23, wherein the monitor and selector are located on a user 
computer. 

43. The system of claim 34, wherein the memory is located on a user computer. 

predetermined number of the plurality of computers at a first predetermined time interval, 
and the host selector ranks the computers accordingly, and further wherein the host 
monitor retains a set of hub computers which make up a predetermined percentage of the 
most highly-ranked computers, and thereafter collects data about only the set of hub 
computers at a second predetermined time interval which is smaller than the first 
predetermined time interval. 

45. A computer program product for enabling a processor in a computer system to 
implement a system for optimally connecting to a peer-to-peer computer network, said 
computer program product comprising: 
    a computer usable medium having computer readable program code means 
    embodied in said medium for causing a program to execute on the computer system, said 
    computer readable program code means comprising: 
    means for collecting data about a plurality of computers within the network, 
    including a network location of each of the plurality of computers; 
    means for selecting at least one computer to be a selected computer, based on the 
    collected data; and 
    means for routing search queries from a user to the selected computer. 

46. The computer program product of claim 45, wherein said means for collecting data 
about a plurality of computers within the network further comprises: 
    means for sending a signal to at least one of the plurality of computers; 
    means for receiving the signal upon its return from the at least one computer; and 
    means for forming a profile characterizing the at least one computer, based on 
    information provided by the signal. 

47. The computer program product of claim 45, wherein said means for collecting data 
about a plurality of computers within the network further comprises: 
    means for collecting a plurality of statistical measures which characterize each of 
    the plurality of computers, 
and wherein said means for selecting the selected computer based on the collected 
data further comprises: 
    means for assigning a weighted score to each statistical measure for each of the 
    plurality of computers; 
    means for combining the weighted scores to obtain a rank for each of the plurality 
    of computers; and 
    means for ranking the plurality of computers according to the resulting ranks. 

48. The computer program product of claim 45, wherein said means for collecting data 
about a plurality of computers within the network further comprises: 
    means for monitoring data exchanges which occur between the plurality of 
    computers. 

49. The computer program product of claim 48, further comprising: 
    means for storing the collected data in a memory, wherein at least a portion of the 
    collected data is content data which comprises information on the content available for 
    searching on the plurality of computers. 

50. The computer program product of claim 49, further comprising: 
    means for removing the content data after a predetermined period of time; 
    means for sending common user search queries into the network on a periodic 
    basis; and 
    means for storing the results in the memory. 

51. The computer program product of claim 49, wherein said means for storing the 
content data in a memory comprises: 
    means for choosing a portion of the content data to store based on previous user 
    requests. 

52. The computer program product of claim 49, wherein said means for collecting data 
about a plurality of computers within the network further comprises: 
    means for monitoring a current connectivity status of each of the plurality of 
    computers, 
and wherein said means for selecting at least one computer to be a selected 
computer based on the collected data further comprises: 
    means for selecting the selected computer based on the content data and the current 
    connectivity status. 

53. The computer program product of claim 45, wherein said means for collecting data 
about a plurality of computers within the network further comprises: 
    means for collecting a plurality of statistical measures which characterize each of 
    the plurality of computers, 
and wherein said means for selecting the selected computer based on the collected 
data further comprises: 
    means for assigning a weighted score to each statistical measure for each of the 
    plurality of computers; 
    means for combining the weighted scores to obtain a rank for each of the plurality 
    of computers; 
    means for ranking the plurality of computers according to the resulting ranks; and 
    means for selecting the at least one computer based on the content data, the current 
    connectivity status and the ranks. 

54. The computer program product of claim 45, further comprising a plurality of 
means for sending the signal from a plurality of geographical locations which are remote 
from one another, wherein the geographical locations are selected based on their 
respective proximity to a plurality of users. 



55. A method for optimizing a computer's access to information, the method 
comprising: 
    maintaining a first database which includes status information about computers 
    within the network; 
    maintaining a second database which includes content information about the 
    computers within the network; 
    filtering the contents of the second database using the contents of the first database, 
    at a time of a user request for information; and 
    accessing at least one computer within the network based on the filtered contents 
    of the second database. 

56. The method of claim 55, wherein said maintaining a first database which includes 
status information about computers within the network further comprises: 
    updating the status information periodically, so that the status information is 
    approximately current in time. 

57. The method of claim 55, wherein said maintaining a second database which 
includes content information about the computers within the network further comprises: 
    intercepting exchanges between the computers within the network. 

58. The method of claim 55, wherein said filtering the contents of the second database 
using the contents of the first database further comprises: 
    identifying computers in the network which are least likely to provide information 
    desired by the user, based on the status information; 
    removing the content information from the second database which is stored on the 
    identified computers. 

59. The method of claim 58, wherein the status information includes a frequency with 
which the computers within the network are connected to the network. 

60. The method of claim 58, wherein the status information includes a current 
connectivity status of the computers within the network. 

61. The method of claim 57, wherein the status information includes a download 
capability of the computers within the network. 

    maintaining a third database which includes content information about the 
    computers within the network which identifies the types of files available for searching on 
    the computers within the network; 
    filtering the contents of the third database using the contents of the first database, 
    at a time of a user request for information; and 
    accessing at least one computer within the network based on the filtered contents 
    of the third database. 